Abstract

The NIST Express Toolkit is a software library for building Express-related tools. This paper
gives  an  introduction,  overview,  and  history  of  the  toolkit.  This  paper  also  describes  how  to  get 
more  information  on  the  toolkit.  No  knowledge  of  Express  or  the  Express  Toolkit  is  presumed 
other  than  a rudimentary  grasp  of  basic  computer  science. 

Context

The  PDES  (Product  Data  Exchange  using  STEP)  activity  is  the  United  States’  effort  in  support  of 
the  Standard  for  the  Exchange  of  Product  Model  Data  (STEP),  an  emerging  international  standard 
for the interchange of product data between various vendors’ CAD/CAM systems and other
manufacturing-related software [10]. A National PDES Testbed has been established at the National
Institute  of  Standards  and  Technology  to  provide  testing  and  validation  facilities  for  the  emerging 
standard.  The  Testbed  is  funded  by  the  Computer-aided  Acquisition  and  Logistic  Support 
(CALS)  program  of  the  Office  of  the  Secretary  of  Defense. 

As  part  of  the  testing  effort,  NIST  is  charged  with  providing  software  for  manipulating  STEP 
data.  The  NIST  Express  Toolkit  is  a part  of  this.  The  toolkit  is  an  evolving,  research-oriented  set 
of software tools. This document is one of a set of reports ([1]-[9]) which describe various
aspects of the Toolkit.

Introduction 

The NIST Express Toolkit is a software library for building software tools for manipulating
information models¹ written in the Express language [11]. An example application (“fedex”) is
included  which  reports  syntactic  and  semantic  errors  in  Express  schemas. 

Figure  1 shows  the  toolkit  in  context.  The  toolkit  acts  as  a database  for  schema  information  stored 
in  the  file  system.  Using  the  toolkit,  an  application  can  query  for  information  about  the  schema 
such as “What entities are defined?” (Listing 1) or “What are the attributes in entity
CURVE?” (Listing 2). The application can also manipulate or augment information in the toolkit.


1.  The  terms  information  model  and  conceptual  schema  are  used  interchangeably  throughout  this  document. 

Figure 1: Data flow in the toolkit

LISTdo(SCOPEget_entities(schema), e, Entity)
    /* print the name of each entity defined in the schema */
    printf("%s\n", ENTITYget_name(e));
LISTod

Listing  1:  What  entities  are  defined  in  a schema? 


LISTdo(ENTITYget_attributes(entity), v, Variable)
    /* print the name of each attribute defined in the entity */
    printf("%s\n", VARget_name(v));
LISTod

Listing  2:  What  attributes  are  defined  in  an  entity? 


The  actual  in-memory  data  structures  used  by  the  toolkit  are  irrelevant  to  the  application  since  the 
toolkit  provides  a well-defined  interface  for  access  to  all  information.  This  well-defined  interface 
is  a set  of  function  calls  encapsulating  the  information  contained  in  the  schema. 

The toolkit allows tools to be schema-independent. Different schemas can be read at run-time,
allowing applications to be flexible in the data that they manipulate. For example, we have built a
Part 21 exchange file parser [12] that works with any Express schema. Another class of such
applications is translators. We have built translators to convert from Express to C++, Smalltalk,
and SQL.

The  translators,  in  turn,  may  be  used  to  build  other  applications  which  are  schema-dependent.  We 
have  built  schema-dependent  tools  such  as  the  Data  Probe,  prototype  Express  and  SQL  schema 
browsers, and data editors [13] [14]. It is important to recognize that these applications, while
schema-dependent, did not actually require the developer to write any schema-dependent code.
Such  code  was  produced  by  the  translators,  and  can  be  reproduced  for  other  schemas  since  the 
translators  themselves  are  schema-independent. 

The choice between schema-dependent and schema-independent applications depends on several factors.
Schema-independent applications usually are physically smaller since they do not have entire
information models embedded within them. These applications are also insulated against changes
in the conceptual schema and, to a certain extent, in Express itself.

On  the  other  hand,  schema-dependent  applications  invariably  have  less  overhead  during  run-time 
devoted  to  initialization  of  the  in-memory  representation  of  the  information  model,  and  can  often 
have  reduced  time  for  data  access  as  well. 

We have also constructed other toolkits that work with the Express Toolkit. For example, we
have built Exppp, a toolkit for pretty-printing (i.e., formatting) Express [15]. This exists as a
separate toolkit since it represents just one style of formatting and there could conceivably be many
others. The Exppp Toolkit, along with the Express Toolkit, has been used to build several more
tools, such as a program to convert STEP Short Listings to Annotated Listings [16] and a program to
manipulate Express within a Tcl/Tk environment [17].

Like  tools,  toolkits  can  similarly  be  either  schema-dependent  or  schema-independent.  The  Exppp 
toolkit  is  schema-independent.  Another  schema-independent  toolkit  is  the  STEP  Class  Library 
(SCL)  Toolkit  [18]  which  provides  support  for  manipulating  Express  inside  a C++  environment. 
In  contrast,  the  Part  21  Exchange  File  Toolkit  [12]  is  schema-independent. 

Environment 

The  Express  Toolkit  was  developed  on  Sun  Microsystems  SPARCstation  workstations  running 
SunOS, an operating system derived from BSD UNIX.² Occasionally, we or others have ported
the software to other platforms such as Digital’s DECstation and Hewlett-Packard’s HP700 and
800-series  workstations.  While  some  small  non-portabilities  invariably  creep  in,  the  system  is 
highly  portable  within  a UNIX  environment.  While  we  have  not  tried  doing  so,  we  believe  the 
software  will  port  easily  to  any  pure-POSIX  [19]  environment.  With  minor  limitations  due  to  file 
system deficiencies, the software should be portable to a PC-based platform.³

The Toolkit is written in ANSI Standard C [20]. The use of prototypes prevents its use in pure
K&R [21] environments, although this could be remedied by code rewriting tools. While any
ANSI C compiler can be used to compile the toolkit, we use GCC, available from the Free
Software Foundation (FSF) [22]. FSF tools are free but with certain distribution restrictions. If GCC
is used to compile the toolkit, certain optimizations are enabled which can increase its
performance.

The toolkit’s scanner is written in Lex [23], a scanner generator commonly provided in UNIX
environments. However, it can also be processed by Flex, another scanner generator that is in the
public domain and quite popular. The toolkit’s parser is written in Yacc [24] (a parser generator
commonly provided in UNIX environments) with special modifications [25] to enhance error
reporting. However, it can also be processed by Bison, another parser generator that is available
from FSF. While not built in to UNIX systems, Flex and Bison are more flexible than Lex and


2. Trade names and company products are mentioned in the text in order to adequately specify experimental
procedures and equipment used. In no case does such identification imply recommendation or endorsement by the
National Institute of Standards and Technology, nor does it imply that the products are necessarily the best
available for the purpose.

3.  Previous  releases  were  dependent  upon  a POSIX.2  environment.  We  have  removed  these  dependencies. 
Yacc, and they have a reputation for being faster as well. Although we have not benchmarked
them,  we  use  Flex  and  Bison  in  our  own  development  workbench  and  encourage  their  use  in  the 
toolkit. 

Performance 

Performance has not been objectively studied; however, we can say something about it
nonetheless. The performance of the toolkit has been significantly improved from earlier releases. The
current release runs in less than 1% of the time of the previous release [26] and uses 60% less
space, while at the same time semantically analyzing more of the information in the Express
specification than before. The area of performance is further described in [3].

While  there  is  no  standard  schema  or  platform  with  which  we  can  characterize  performance,  some 
simple statistics are possible. On a Sun SPARCstation 2, a 100Kb schema takes on the order of 1
to 2 seconds to process, including reporting any syntactic or semantic errors. The toolkit is
approximately 14000 lines of C (including comments), which compiles in 2 minutes using GCC.

How  to  Obtain  the  Toolkit 

The  toolkit  and  its  documentation  may  be  obtained  in  a variety  of  ways.  The  simplest  way  is 
through  anonymous  ftp  via  the  Internet.  In  this  case,  the  source  is  pub/step/npttools/exptk.tar.Z 
on  ftp.cme.nist.gov.  Complete  documentation  on  obtaining  the  toolkit  and  its  documentation  is  in 
/pub/step/nptdocs/exptk-obtaining-installing.ps.Z [5].

Alternately,  it  is  possible  to  receive  the  toolkit  by  email.  To  do  this,  send  the  following  mail  to 
ntpserver@cme.nist.gov: 

send step/npttools/exptk.tar.Z
send step/nptdocs/exptk-obtaining-installing.ps.Z

If  you  do  not  understand  these  instructions  or  for  any  other  reason  cannot  successfully  use  ftp  or 
email,  contact: 

FASD  - National  PDES  Testbed 
National Institute of Standards and Technology
Bldg 220, Rm A-127
Gaithersburg,  MD  20899 

npt-info@cme.nist.gov 

1-301-975-3179 

Questions,  Problems,  and  Support 

The  system  is  distributed  in  source  form  and  you  are  encouraged  to  experiment  with  the  system, 
especially  if  you  have  problems  with  it.  While  it  is  often  quicker  for  you  to  have  us  diagnose  your 
problems, it is quicker for us to have you diagnose your own problems. This software is a
prototype, intended to spur development of commercial products.

Nonetheless, if you do have questions and/or problems, you may send e-mail to the following
addresses. Please include schemas, version numbers, platform descriptions, and any other
information that could be relevant.

Annotated Listing Generator    shtolo@cme.nist.gov
Data Probe                     dprobe@cme.nist.gov
Express Analysis               exptk@cme.nist.gov
Express Server                 express-server-admin@cme.nist.gov
Part 21 Analysis               p21tk@cme.nist.gov
STEP Class Library             stepcl@cme.nist.gov


History  and  Credits 

The  idea  of  a schema-independent  toolkit  was  first  proposed  by  Steve  Clark  (NIST).  Clark  wrote 
the initial release of the toolkit. Written in C, it was a non-object-oriented implementation
characterized by a single, single-pass-resolution phase. It was based on the “Tokyo” draft of Express.

After attempting a short-lived version in C++, Clark rewrote it in C but with a hand-built
object-oriented engine for N496 [27]. This was publicly released in 1988 and saw wide distribution.

Around  the  same  time,  Bruce  Thomas  (NIST)  created  a similar  toolkit  for  what  was  to  become  the 
STEP  Part  21  Exchange  File  Format.  Several  other  NIST  employees  worked  on  this  including 
Sandy Ressler, Tina Lee, and Cathy Diaz. Clark eventually took over control of this software,
integrating it into a framework similar to the Express toolkit. Using both toolkits, Clark wrote the
first application, an Express to Smalltalk translator [28]. In the following year, numerous
applications appeared, including an Express to SQL translator and an Express to C++ translator, both
written by KC Morris (NIST).

In 1989, Clark began participating in the Express standards committee, while relinquishing
further software development to Don Libes (NIST). Libes worked on speeding up the
implementation primarily by reimplementing symbol tables with hash tables instead of linked
lists. In September 1990, based on N496, this release was distributed to NIST’s PDES, Inc.
partners but was not made publicly available.

Libes  then  enhanced  the  software  so  that  it  supported  N14.  Dave  Briggs  (Boeing  and  PDES,  Inc.) 
contributed the implementation of Use and Reference. This implementation was publicly
distributed in November 1991.

Up  to  this  point,  this  and  other  software  was  collectively  known  as  the  “NIST  PDES  toolkit”.  As 
different rates of revision in various standards caused pieces of the software to be revised
separately, the PDES toolkit was broken up into a number of toolkits, such as the Express Toolkit.

During  1992,  Libes  rewrote  the  Express  Toolkit  while  ostensibly  converting  it  from  N14  to  N151 
(Draft International Standard). In the interests of efficiency, the object-oriented engine was
removed, and the single-pass resolution was converted to multiple passes. This software ran over
100 times faster than the earlier object-oriented releases. In addition, many of the missing
features of Express were finally implemented. This is described further in [3].

The authors thank the numerous testers and application writers who put up with continual
“improvements” to the toolkit, and who gave high-quality feedback. Thanks particularly to Jim
Fowler, KC Morris, Kent Shepherd, Gerard Silvemale, Tom Kramer, Kent Reed, Gerard
Silvernail, Connie Augustin, Newton Breese, Cita Furlani, Lisa Phillips, Dave Sauder, Mary Mitchell,
Peter Carr, David Helfrick, Sandy Ressler, Jeane Ford and numerous other people who
contributed requirements, suggestions, and bug fixes.

The NIST Express Toolkit is funded by the Computer-aided Acquisition and Logistic Support
(CALS) program of the Office of the Secretary of Defense (see Context).

Documentation 

The following papers describe various aspects of the toolkit. Note that the implementation is
subject to change. Because of this, no guarantee is made that the descriptions in this paper are still
accurate  with  regard  to  the  current  implementation. 